False Alarm Reduction Techniques for ASR and OCR
Abstract
Techniques which may reduce the false alarm rate of sliding networks employed in ASR (automatic speech recognition) and OCR (optical character recognition) problems are reviewed. Such techniques include the use of hypersphere classifiers, training out of negatives, addition of garbage units, and exploitation of multiple networks. The combination of several techniques may reduce the false alarm rate to tolerable levels.
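As a rough illustration of how two of the listed ideas might combine, the sketch below rejects a window unless several networks agree on the same non-garbage class with sufficient confidence. It is not taken from the paper: the function name `accept_detection`, the `garbage_index` convention, and the `threshold` value are all hypothetical choices made for this example.

```python
def accept_detection(scores_per_network, garbage_index, threshold=0.5):
    """Return the accepted class index for one window, or None to reject.

    scores_per_network: list of per-class score lists, one list per network
    (hypothetical input format). A window is accepted only if every network
    picks the same non-garbage class and each winning score is at least
    `threshold`.
    """
    winners = []
    for scores in scores_per_network:
        # Index of the highest-scoring output unit for this network.
        best = max(range(len(scores)), key=lambda i: scores[i])
        # Reject if the garbage unit wins or the winner is too weak.
        if best == garbage_index or scores[best] < threshold:
            return None
        winners.append(best)
    # Reject if the networks disagree (or if no networks were given).
    if len(set(winners)) != 1:
        return None
    return winners[0]


# Two networks, three output units each; unit 2 is the garbage unit.
agree = accept_detection([[0.1, 0.8, 0.1], [0.2, 0.7, 0.1]], garbage_index=2)
veto = accept_detection([[0.1, 0.8, 0.1], [0.1, 0.2, 0.7]], garbage_index=2)
print(agree, veto)
```

Requiring unanimous agreement trades some missed detections for fewer false alarms; a majority vote would be a looser alternative under the same assumptions.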
Similar resources
BBN VISER TRECVID 2012 Multimedia Event Detection and Multimedia Event Recounting Systems
We describe the Raytheon BBN Technologies (BBN) led VISER system for the TRECVID 2012 Multimedia Event Detection (MED) and Recounting (MER) tasks. We present a comprehensive analysis of the different modules in our evaluation system that includes: (1) a large suite of visual, audio and multimodal low-level features, (2) modules to detect semantic scene/action/object concepts over the entire vid...
Variable-Span out-of-vocabulary named entity detection
Out-of-vocabulary named entities (OOV NEs) are always misrecognized by fixed-vocabulary automatic speech recognition (ASR) systems. This has a negative impact on downstream applications such as language understanding and machine translation (MT). Automatic detection of OOV NEs in ASR hypotheses can help mitigate this problem by triggering the use of alternative approaches to acquire and process...
Rescoring Hypothesized Detections of Out-of-Vocabulary Keywords Using Subword Samples
Rescoring hypothesized detections, using keyword’s audio samples extracted from training data, is an effective way to improve the performance of a Keyword Search (KWS) system. Unfortunately such rescoring framework cannot be applied directly to Out-of-Vocabulary (OOV) keywords since there is no sample in the training data. To address this limitation, we propose two techniques for OOV keywords i...
A comparison of multiple methods for rescoring keyword search lists for low resource languages
We review the performance of a new two-stage cascaded machine learning approach for rescoring keyword search output for low resource languages. In the first stage Confusion Networks (CNs) are rescored for improved Automatic Speech Recognition (ASR) by reranking the arcs of each confusion bin. In the second stage we generate keyword search hypotheses from the rescored ASR output and rescore them...
Confidence Prediction for Lexicon-Free OCR
Having a reliable accuracy score is crucial for real world applications of OCR, since such systems are judged by the number of false readings. Lexicon-based OCR systems, which deal with what is essentially a multi-class classification problem, often employ methods explicitly taking into account the lexicon, in order to improve accuracy. However, in lexicon-free scenarios, filtering errors requi...